DESIGN AND ANALYSIS OF ENTOMOLOGICAL FIELD EXPERIMENTS


William  A.  Brown 

Test  Design  and  Analysis  Office,  Dugway  Proving  Ground 

and 

Scott  A.  Krane 

Dugway  Field  Office,  C-E-I-R,  INC. 

Recently two entomological field experiments were conducted at Dugway
Proving Ground. The purpose of the first experiment was to compare the
biting propensity of two strains of a species of insect. In each trial,
four circles of 15-foot radius were scribed, and 10 hosts, randomly
selected, were positioned equidistantly along each circumference. The
Number 1 position in each circle was oriented to true north. (See Figure
1.) At function time, 100 individuals of the appropriate strain were
released at the center of each circle. In two of the circles, the A strain
was used; in the other two circles, the B strain. The men were seated
on the ground and remained relatively motionless throughout the trial.
Sampling consisted of each man recording those bites actually received
and entering the total number, for 5-minute intervals, on a data card.
Sampling was conducted for 30 minutes following the release unless biting
activity continued. In that event, sampling in all circles was extended for
additional 5-minute periods until the biting activity had essentially ceased.

Comparisons between strains were thus subject to the variation found
among circles. This variation was expected to be appreciably larger
than the variation among men on a circle. By the nature of the
experimental treatments (strains), however, it was necessary to separate
the strains either in space or in time sufficiently that their ranges of
biting activity did not overlap. Only in this manner could bites be
accurately attributed to one strain or the other. The duplicate circles for
each strain represented an effort to partially overcome this inherent
insensitivity.

In the analysis of the data, it was considered useful to employ a
mathematical model to describe the distribution of the number of bites
per host. The simplest model which might conceivably fit the
observations is the Poisson, given by:

(1)   $f(x) = e^{-m} m^{x} / x!$,   $x = 0, 1, \ldots, n$,

where x is the number of bites received by an individual host, f(x) is
the probability (or relative frequency) of x bites, and m is an unknown
parameter equal to the "long-run" average number of bites per host. If
the individuals of a strain are randomly distributed throughout a given
area, and if hosts are equally attractive, then the Poisson model
should be appropriate.
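
As a minimal sketch of what the Poisson model predicts, the short Python
fragment below evaluates equation (1) and converts the probabilities into
expected numbers of hosts on one circle. The mean of 2.0 bites is a
hypothetical value chosen only for illustration; the figure of 10 hosts
per circle is taken from the first experiment.

```python
import math

def poisson_pmf(x: int, m: float) -> float:
    """Probability of exactly x bites when the long-run mean is m."""
    return math.exp(-m) * m ** x / math.factorial(x)

m = 2.0      # hypothetical mean bites per host in one 5-minute period
hosts = 10   # hosts per circle, as in the first experiment

# Expected number of hosts receiving 0, 1, ..., 7 bites under the model.
expected = [hosts * poisson_pmf(x, m) for x in range(8)]
for x, e in enumerate(expected):
    print(f"{x} bites: {e:.2f} hosts expected")
```

Observed frequencies that pile up at zero and at large counts, relative to
these expected values, would suggest that the Poisson assumptions fail.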

Previous studies, however, have indicated that the spatial distribution
of insects released in this manner is not random (perhaps being
influenced by the wind direction, for example), nor are all hosts equally
attractive. As a result of these tendencies, the distribution of bites will
be "over-dispersed" relative to the Poisson distribution, i.e., the
number of hosts receiving a very large number of bites and the number of
hosts receiving a very small number of bites will both be larger than the
number predicted by the Poisson model, while the number receiving
near-average numbers of bites will be smaller.

One of the simplest and most frequently used "over-dispersed"
statistical models is the negative binomial, which has the general term:

(2)   $f(x) = \dfrac{(k + x - 1)!}{x!\,(k - 1)!} \, \dfrac{p^{x}}{q^{k+x}}$,   $x = 0, 1, 2, \ldots, n$,

where q equals 1 + p, and p and k are unknown parameters. Various
rationales may be given for the negative binomial.¹ One of the simplest
is that the negative binomial is produced by a mixture of Poisson
distributions in which the parameter, m, varies according to a "gamma"
distribution. While no rationale appears to be particularly compelling
in the present problem, the relative simplicity of the negative binomial
model and the success with which other investigators have applied it to
biological data are taken to justify its use, at least as a working
hypothesis.
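
The general term of the negative binomial can be checked numerically. The
sketch below assumes the parameterization with q = 1 + p, for which the
mean is kp and the variance is kpq, and writes the factorials with the
gamma function so that k need not be an integer. The value k = 0.51 is
borrowed from the common-k estimate reported later in the paper; the mean
of 2.0 bites is hypothetical.

```python
import math

def neg_binomial_pmf(x: int, k: float, p: float) -> float:
    """General term of the negative binomial with q = 1 + p.

    (k + x - 1)! / (x! (k - 1)!) is evaluated through log-gamma so that
    non-integer k is allowed.
    """
    q = 1.0 + p
    log_coef = math.lgamma(k + x) - math.lgamma(x + 1) - math.lgamma(k)
    return math.exp(log_coef + x * math.log(p) - (k + x) * math.log(q))

k = 0.51        # common k estimated for Experiment 1 in the paper
mean = 2.0      # hypothetical mean bites per host
p = mean / k    # because the mean of the distribution is k * p

xs = range(3000)   # long enough for the tail to be negligible here
probs = [neg_binomial_pmf(x, k, p) for x in xs]
nb_mean = sum(x * f for x, f in zip(xs, probs))
nb_var = sum((x - nb_mean) ** 2 * f for x, f in zip(xs, probs))

print(f"mean     = {nb_mean:.3f}")
print(f"variance = {nb_var:.3f}  (k*p*q = {k * p * (1 + p):.3f})")
```

With the same mean, the Poisson variance would equal the mean, so the
excess of kpq over kp is a direct measure of the over-dispersion.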

For each 5-minute time period, the mean and variance of the reported
bites at each circle were estimated. Each set of data was then tested
for over-dispersion with respect to a Poisson distribution by the $\chi^2$
statistic:

(3)   $\chi^{2}_{n-1} = (n-1)\,s^{2} / \bar{x}$,

where n is the sample size (number of hosts), $s^2$ is the sample variance
of the number of bites, and $\bar{x}$ is the sample mean number of bites.

¹Bliss, C. I. Fitting the Negative Binomial Distribution to Biological
Data. Biometrics, Vol. 9(2), pp. 176-196, 1953.
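
The dispersion test can be sketched in a few lines of Python. The
fragment below is not the authors' FORTRAN program; it computes the
statistic $(n-1)s^2/\bar{x}$ and compares it with the 20 per cent
upper-tail value obtained from the approximation quoted in the paper's
footnote. The bite counts and function names are hypothetical.

```python
import math
from statistics import mean, variance

def dispersion_statistic(bites):
    """(n - 1) * s^2 / x-bar for one circle and one time period."""
    n = len(bites)
    return (n - 1) * variance(bites) / mean(bites)

def upper_20_percent_value(n):
    """Approximate 20 per cent upper-tail chi-square value, n - 1 d.f.

    Uses the footnoted approximation
    log_e(chi2 / (n - 1) - 1) = -0.038 - 0.452 * log_e(n - 1).
    """
    df = n - 1
    return df * (1.0 + math.exp(-0.038 - 0.452 * math.log(df)))

# Hypothetical bite counts for the 10 hosts on one circle.
bites = [0, 0, 1, 1, 2, 2, 3, 5, 8, 12]
stat = dispersion_statistic(bites)
crit = upper_20_percent_value(len(bites))
print(f"statistic = {stat:.2f}, critical value = {crit:.2f}")
print("over-dispersed" if stat > crit else "consistent with Poisson")
```

The statistic is undefined when no bites at all are recorded, since the
sample mean is then zero; such sets would need to be handled separately.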

The calculated statistic was tested for significance by comparison with
the 20 per cent upper tail value of the $\chi^2$ distribution.³ If the test
did not indicate over-dispersion, the data were subsequently fitted to a
Poisson distribution and subjected to a "goodness-of-fit" test. If
the test did indicate over-dispersion, the data were fitted by the method
of maximum likelihood⁴ to a negative binomial distribution, and then
subjected to a goodness-of-fit test. All of the above calculations were
performed on the IBM 1620 Computer, using a specially prepared FORTRAN
program. Fifty-five of 96 sets of 5-minute data showed close agreement
with the Poisson distribution. For each of these sets of data, however,
the variance was usually larger than the mean, and, consequently, a
further comparison with the negative binomial distribution would generally
have shown even closer agreement.⁵ Therefore, it was decided that,
for the purposes of the analysis of variance, an appropriate
transformation to stabilize variance for these data would be that derived
for the negative binomial:⁶,⁷

(4)   $y = \lambda^{-1} \sinh^{-1}\!\left(\lambda \sqrt{x + 1/2}\right)$.

²Ibid.

³For convenience of internal calculation on a digital computer, the 20 per
cent upper tail value was obtained from the approximation:

   $\log_e\!\left(\dfrac{\chi^{2}_{n-1}}{n-1} - 1\right) = -0.038 - 0.452 \log_e (n-1)$.

⁴Fisher, R. A. Notes on the Efficient Fitting of the Negative Binomial.
Biometrics, Vol. 9(2), pp. 196-200, 1953.

⁵The Poisson is, in fact, a limiting case of the negative binomial, from
which it follows that a negative binomial must fit data at least as well as
the Poisson.

⁶Bartlett, M. S. The Use of Transformations. Biometrics, Vol. 3(1),
pp. 39-52, March 1947.

⁷Kempthorne, O. Design and Analysis of Experiments. Chapter 8. John
Wiley and Sons, Inc., 1952.


Figure 2, showing a plot of mean versus variance on log-log paper,
illustrates the closer agreement with the negative binomial distribution.
The diagonal line represents the square root transformation, appropriate
for variance stabilization of Poisson distributed data, and the curved
line represents the transformation
$\lambda^{-1} \sinh^{-1}(\lambda \sqrt{x + 1/2})$, where $\lambda$ has the
value 1.0. (After several guesses of $\lambda$, the value of 1.0 was
selected since, by eye-fitting, it appeared to reasonably minimize the
deviations from the curve. Using the value $\lambda = 1.0$, 47 of the data
points lie above the curve, and 49 below.) Subsequent analysis of the
data of Experiment 1, using a method of Bliss and Owen⁸ for the
estimation of a common k, resulted in the estimate

   $k_c = 0.51$.

Since $\lambda^{2} = 1/k$, so that $\lambda = k^{-0.5}$, the value of
$\lambda$ appropriate for this estimate of k is 1.4.

As shown in Figure 2, the value of $\lambda = 1.0$ obtained graphically
agrees reasonably well with that estimated by the method of Bliss and
Owen. It can easily be seen in Figure 2 that the data follow the curved
line more closely. However, the dashed line indicates that the logarithmic
transformation may be as suitable as the inverse hyperbolic sine.
Furthermore, analysis of logarithmically transformed data permits
interpretation of results in terms of ratios of treatment effects, while
no such interpretation arises directly from the negative binomial
transformation. Therefore, separate analyses of variance were performed,
using the two transformations.
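
The two transformations, and the relation between $\lambda$ and k, can be
sketched as follows. The function names and the sample bite counts are
illustrative only, and the natural logarithm is an assumption, since the
paper does not state the base used for log (x + 1).

```python
import math

def asinh_transform(x: float, lam: float = 1.0) -> float:
    """y = (1/lam) * asinh(lam * sqrt(x + 1/2)) for a bite count x."""
    return math.asinh(lam * math.sqrt(x + 0.5)) / lam

def log_transform(x: float) -> float:
    """y = log(x + 1); the natural logarithm is assumed here."""
    return math.log(x + 1.0)

def lam_from_k(k: float) -> float:
    """lam = k ** -0.5, which follows from lam**2 = 1/k."""
    return k ** -0.5

print(f"lam for k = 0.51: {lam_from_k(0.51):.2f}")
for x in (0, 1, 5, 20, 100):   # hypothetical bite counts
    print(x, round(asinh_transform(x), 3), round(log_transform(x), 3))
```

For large counts the inverse hyperbolic sine behaves like half the
logarithm of x plus a constant, which is consistent with the two curves in
Figure 2 lying close together.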


An analysis of variance, based on the three-way cross classification
of trial, strain, and time period, was performed on each of the following
four sets of data:

1. The values of $y = \lambda^{-1} \sinh^{-1}(\lambda \sqrt{x + 1/2})$,
where $\lambda$ equals 1.0 and x is the total number of bites received by
a host during a given time period,

2. The values of $y = \lambda^{-1} \sinh^{-1}(\lambda \sqrt{x + 1/2})$,
where $\lambda$ equals 1.0 and x is the total number of bites received at
a circle during a given time period,

3. The values of y = log (x + 1), where x is the total number of
bites received by a host during a given time period, and

4. The values of y = log (x + 1), where x is the total number of
bites received at a circle during a given time period.

⁸Bliss, C. I. and A. R. G. Owen, "Negative Binomial Distributions With
a Common K", Biometrika 45, pp. 37-58, 1958.

The  results  of  each  of  these  analyses  are  presented  in  Table  1. 

As shown in Table 1, the results obtained in the four analyses of
variance were essentially the same. Each analysis indicated that the
total numbers of bites obtained during the six time periods were
significantly different, and that no significant difference could be
detected between strains. In every analysis, however, Error (a) was
relatively large, so that the F test, comparing strain effects, was
undoubtedly insensitive. As mentioned earlier, the insensitivity of the
analyses for strain differences follows unavoidably from the design of
these trials, in which strain comparisons could only be made between
circles (rather than within circles), and, hence, are subject to the
greater variability found from circle to circle as measured by Error (a).

The  purpose  of  the  second  experiment  was  to  compare  the  dispersal 
of  two  strains  of  a species  of  insect  as  measured  by  their  biting  activity. 

For this experiment, it was greatly desired that the ambient air
temperature and windspeed range of an A-B strain pair of trials be as
similar as possible. However, because of the small number of men
available, concurrent testing of the two strains could not be accomplished.
Therefore, whenever possible, two trials were conducted each day: one
trial using the A strain and the other, following as soon after as
practicable, employing the B strain.

In each trial, four concentric circles were used, designated Circles
A, B, C, and D, with radii equal to 100, 200, 300, and 400 feet. Eight
men were positioned equidistantly around the circumference of each of
Circles A, B, and D, and 16 men were positioned equidistantly around the
circumference of Circle C. (See Figure 3.) At function time, 1000
individuals of the appropriate strain were released at the center of the
concentric configuration, and the men, seated and facing the release
point, recorded biting activity for at least 30 minutes. The sampling
procedures were the same as those used for the first experiment.

For each 5-minute time period, the mean and variance of the reported
bites at each circle were estimated. Each set of data was then tested for
"over-dispersion," with respect to a Poisson distribution, in the same
manner as in the first experiment.

The results indicated that there was nearly always a departure from
the Poisson distribution, in the direction of higher variance and
"over-dispersion." In addition, 45 of the 70 sets of 5-minute data showed
agreement with the negative binomial distribution at a nominal 95 per cent
confidence level. Further, from an examination of the plot of the mean
versus the variance (see Figure 4), it did not appear that the data would
fit any other distribution more consistently. Therefore, it was decided
that, for the purposes of the analysis of variance, a suitable
transformation to normalize these data would be:

   $y = \lambda^{-1} \sinh^{-1}(\lambda \sqrt{x + 1/2})$.

After several guesses of $\lambda$, and, subsequently, fitting the data by
eye to $\lambda^{-1} \sinh^{-1}(\lambda \sqrt{x + 1/2})$, it appeared that
a reasonable estimate that would minimize the deviations from the curve
was $\lambda = 1.0$. Using this value, 42 of the data points lie above the
curve, and 43 below.

An analysis of variance, based on the four-way cross classification
of day, strain, circle, and time period, was performed on each of the
following sets of data:

1. The values of $y = \lambda^{-1} \sinh^{-1}(\lambda \sqrt{x + 1/2})$,
where $\lambda = 1.0$, and x is the total number of bites received during
a given time period at Circles A, B, and D, and one-half the total
received at Circle C.

The  results  of  these  analyses  are  presented  in  Table  2. 
Table 2: Results of the Analyses of Variance of Transformed Bite Data

                                 RESULTS OF ANALYSIS OF VARIANCE FOR
                                    INDICATED TRANSFORMED DATA*
SOURCE           DEGREES               1                      2
OF               OF             Mean        F          Mean        F
VARIATION        FREEDOM        Square      Value      Square      Value

Day, D               1          0.342065               0.331695
Strain, S            1          0.265038               0.241103
Error (a)            1          0.271097               0.225376
Circle, C            3          1.020387    50.5 **    1.199593    71.9 **
C x S                3          0.044502     2.20      0.044751     2.68
Time Period, T       9          0.670518    33.2 **    0.623410    37.4 **
T x S                9          0.115344     5.70**    0.108279     6.49**
T x C               27          0.044332     2.19**    0.043864     2.63**
T x C x S           27          0.011870     0.587     0.014230     0.853
Error (b)           78          0.020225               0.016677
Total              159

*1 denotes the data resulting from the $\sinh^{-1}(\lambda \sqrt{x + 1/2})$
transformation of total bites received at a circle during a given time
period; and 2 denotes the data resulting from the
$\sinh^{-1}(\lambda \sqrt{x + 1/2})$ transformation of total bites
received during a given time period for Circles A, B, and D, and one-half
the total bites received at Circle C.

** Significant at the 1.0 per cent level.

As  shown  by  the  F values  in  Table  2,  the  second  analysis,  adjusting 
for  the  augmented  sampling  on  Circle  C,  was  the  more  sensitive.  Both 
analyses,  however,  showed  circle,  time  period,  T x S,  and  T x C to  be 
highly  significant. 
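
The tabulated F values appear to be each mean square divided by the
Error (b) mean square of the same analysis. The following sketch, with
mean squares copied from Table 2 and a hypothetical helper named
`f_ratio`, reproduces that arithmetic.

```python
# Mean squares copied from Table 2 (transformed data sets 1 and 2).
ERROR_B = {1: 0.020225, 2: 0.016677}
MEAN_SQUARES = {
    "Circle, C":      {1: 1.020387, 2: 1.199593},
    "C x S":          {1: 0.044502, 2: 0.044751},
    "Time Period, T": {1: 0.670518, 2: 0.623410},
    "T x S":          {1: 0.115344, 2: 0.108279},
    "T x C":          {1: 0.044332, 2: 0.043864},
    "T x C x S":      {1: 0.011870, 2: 0.014230},
}

def f_ratio(source: str, data_set: int) -> float:
    """Mean square for a source divided by the Error (b) mean square."""
    return MEAN_SQUARES[source][data_set] / ERROR_B[data_set]

for source in MEAN_SQUARES:
    print(f"{source:15s} {f_ratio(source, 1):8.3f} {f_ratio(source, 2):8.3f}")
```

Because Error (b) is smaller in the second analysis while most mean
squares are similar, the second set of F ratios is uniformly larger, which
is the sense in which that analysis is the more sensitive.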


Using the second set of transformed data, a further investigation of
circle, T x S, and T x C was made in the following way. From the
analysis of the transformed values

(5)   $y_{ijk} = \lambda^{-1} \sinh^{-1}\!\left(\lambda \sqrt{x_{ijk} + 1/2}\right)$,   $\lambda = 1.0$,

              $= \sinh^{-1} \sqrt{x_{ijk} + 1/2}$,

where $x_{ijk}$ equals the total number of bites received by the i-th host
at the j-th circle during the k-th time period, the mean values,
$\bar{y}_{jk}$, of the transformed variables were obtained. These mean
values are related to the estimated true average number of bites received
at the j-th circle during the k-th time period, $m_{jk}$, by

(6)   $\bar{y}_{jk} = \sinh^{-1} \sqrt{m_{jk} + 1/2}$,   hence   $m_{jk} = (\sinh \bar{y}_{jk})^{2} - 1/2$.

Relationships between $\bar{y}_{jk}$ and circle radius, $R_j$, were
sought. The best simple relationship found was:

(7)   $\bar{y}_{jk} = a - b \log_e R_j$,

where a and b are regression constants determined by the method of
least squares. Then,

(8)   $m_{jk} = [\sinh (a - b \log_e R_j)]^{2} - 1/2$

(9)          $= [\tfrac{1}{2}(e^{a - b \log_e R_j} - e^{-(a - b \log_e R_j)})]^{2} - 1/2$

(10)         $= [\tfrac{1}{2}(e^{a} R_j^{-b} - e^{-a} R_j^{b})]^{2} - 1/2$,

which can be approximated by:

(12)  $m_{jk} \approx \dfrac{e^{2a}}{4 R_j^{2b}} - 1/2$,

(since $e^{-a} R_j^{b}$ is small relative to $e^{a} R_j^{-b}$).


For each 5-minute time period, the transformed data were summed with
respect to circle and strain. These values were then fitted to the above
regression model, and the average values of the various a's and b's
determined. Subsequently, for each strain, the true average number of
bites was estimated for each circle during the various time periods. These
latter values are presented in Table 3.
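
The fitting and back-transformation steps can be sketched as below. This
is an illustration under stated assumptions, not the authors' computation:
the four mean bite values are illustrative inputs of the same order of
magnitude as those reported for the experiment, and the function names are
hypothetical. The constants a and b are obtained by ordinary least squares
on $\log_e R_j$, and the exact back-transformation is compared with its
approximation.

```python
import math

def fit_a_b(radii, y_means):
    """Least-squares constants for y = a - b * log_e(R)."""
    xs = [math.log(r) for r in radii]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(y_means) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, y_means))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = -sxy / sxx                 # the model is written with a minus sign
    a = y_bar + b * x_bar
    return a, b

def mean_bites(a, b, radius):
    """Exact back-transformation: [sinh(a - b log_e R)]^2 - 1/2."""
    return math.sinh(a - b * math.log(radius)) ** 2 - 0.5

def mean_bites_approx(a, b, radius):
    """Approximation e^(2a) / (4 R^(2b)) - 1/2."""
    return math.exp(2 * a) / (4 * radius ** (2 * b)) - 0.5

radii = [100, 200, 300, 400]     # feet, Circles A to D
bites = [145, 26, 10, 5]         # illustrative mean bites per circle
y_means = [math.asinh(math.sqrt(m + 0.5)) for m in bites]

a, b = fit_a_b(radii, y_means)
print(f"a = {a:.3f}, b = {b:.3f}")
for r in radii:
    exact = mean_bites(a, b, r)
    approx = mean_bites_approx(a, b, r)
    print(f"R = {r}: exact {exact:.1f}, approximate {approx:.1f}")
```

In this sketch the approximate value runs about one-half bite above the
exact one, because squaring the difference of exponentials also produces a
constant cross term that the approximation drops along with the small
exponential.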
Table 3: Estimated True Average Number of Bites of A and B Strain at
the Various Circles During Given Time Periods.

                       ESTIMATED TRUE AVERAGE NUMBER OF BITES AT
                       INDICATED CIRCLE DURING GIVEN TIME PERIOD
          TIME
STRAIN    PERIOD       Circle A     Circle B     Circle C     Circle D
          (Minutes)    (100 feet)   (200 feet)   (300 feet)   (400 feet)

           0-5            123           15            4            1
           5-10           145           26           10            5
          10-15           119           28           12            7
          15-20           120           29           12            7
  A       20-25            82           20            8            4
          25-30            39           15            8            5
          30-35            22           10            6            4
          35-40            16            8            5            4
          40-45             8            6            5            4
          45-50             6            4            4            3

           0-5             99           25           11            6
           5-10           132           35           16            9
          10-15            99           32           16           10
          15-20            59           21           12            8
  B       20-25            39           19           13           10
          25-30            19            9            5            4
          30-35             9            6            5            4
          35-40             2            2            2            2
          40-45             4            2            1            1
          45-50             0            0            0            0


As shown in Table 3, the expected number of A strain bites at Circle A
during each of the various time periods is greater than that for B strain;
however, the difference, in general, is not appreciable. At Circles B, C,
and D, there appears to be no important difference between the numbers of
bites. It was, therefore, concluded that the spatial dispersion of the two
strains was comparable, as indicated by the nonsignificance of the C x S
interaction.



The significance of the T x S interaction indicates a difference
between strains with respect to the temporal dispersion characteristics.
Generally speaking, the biting activity of strain B appeared to exhibit
a more pronounced peak in time and a slightly earlier decline.


Unfortunately,  no  satisfactory  model  for  the  characterization  of  the 
biting  activity  as  a function  of  time  has  been  found  by  the  authors.  It  is 
hoped  that  such  a model  may  yet  be  developed. 


